Text File | 1996-04-14 | 17KB | 443 lines
Translation system for Linuxconf
Introduction
Linuxconf is a large software component, full of menus and dialogs.
To be easily translatable, all messages must be extracted from the C++
source code and placed into dictionaries which can be translated
efficiently. A special set of tools has been designed to achieve this.
They are described here.
1. Introduction
This document describes both how the system works and how translators
can use it. It starts by explaining how programmers can use it to
produce translatable programs. The section "How to translate" explains
how translators can use this system to translate Linuxconf or any
program written using this system.
2. Principles
To make programs easily translatable, all messages should be placed in
dictionaries. A dictionary is made of message entries. Each message
has a unique ID and a value. In the C++ source, programmers refer to
those messages by ID whenever they want to print or say something.
Each time a programmer needs a new message, he has to add it to the
message dictionary and reference it from the C++ source code. This is
how most systems work (there are other translation systems out there).
The system used by Linuxconf is fundamentally different. Messages are
defined in the C++ source code and the dictionaries are built by
scanning all C++ source files. Programmers must provide an ID and a
value for each message right in the source code. This is much easier
(or nicer) than going back and forth between the source and the
dictionary. Furthermore, the programmer sees the message definition
directly in the source. With other systems, only the message ID is
visible in the source.
Using the magic of the C preprocessor, the message value is not
compiled into the object code at all. Seen this way, the translation
system used by Linuxconf yields the same result as other systems. It is
just nicer for programmers to use.
Let's describe how a programmer uses the system.
2.1. One dictionary per source directory
It is best to define one message dictionary per sub-project or
sub-directory. This is easier to manage and avoids ID name space
congestion. For each source directory of Linuxconf you have one "dic"
file and one "m" file. Both files are produced simply by doing
    make msg
This command scans all C++ source files of the current directory and
updates the file ../messages/sources/DIRECTORY.dic and the file
DIRECTORY.m, where DIRECTORY is the name of the current directory.
make msg uses the ../translate/msgscan utility to scan the source. This
utility looks for specific constructs in the C++ source files. Here
they are.
2.2. The MSG_U macro
The MSG_U macro defines a new message. It defines both its ID and its
value. This macro is usable anywhere a C++ string would be.
    #include <stdio.h>
    #include "prjfoo.m"
    int foo()
    {
        printf (MSG_U(M_MSG1,"Entering function foo"));
        return 0;
    }
MSG_U defines a single value: the U stands for unilingual.
2.3. The MSG_B macro
The MSG_B macro is like the MSG_U macro, except it defines two values,
allowing a programmer to code two languages at once. The B
stands for bilingual. This has not been used in the Linuxconf project
but has proven effective for other projects.
    #include <stdio.h>
    #include "prjfoo.m"
    int foo()
    {
        printf (MSG_B(M_MSG1
            ,"Entering function foo\n"
            ,"Démarrage de la fonction foo\n"));
        return 0;
    }
2.4. The MSG_R macro
The MSG_R macro simply references an already defined message. This
message may have been defined in another source file (of the same
project). Like the other macros, MSG_R may be used anywhere a C++
string is.
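
The text gives no example for MSG_R. The following is a minimal sketch,
not taken from the Linuxconf sources: the table demo_dict, the ID
M_DONE and both functions are hypothetical, and the macro definitions
are written by hand here only so the fragment compiles on its own
(normally they come from the generated .m file).

```cpp
#include <string>

// Hand-written stand-ins for what a generated .m file would provide.
// demo_dict and M_DONE are hypothetical names used only in this sketch.
static const char *demo_values[] = { "Done\n" };
const char **demo_dict = demo_values;
#define MSG_U(id,m) id
#define MSG_R(id)   id
#define M_DONE      demo_dict[0]

// First use: MSG_U defines the message, giving both its ID and its value.
std::string report_copy()
{
    return MSG_U(M_DONE,"Done\n");
}

// Later use, possibly in another source file of the same project:
// MSG_R only names the ID, the value is not repeated.
std::string report_move()
{
    return MSG_R(M_DONE);
}
```

Both functions end up reading the same dictionary slot, so a translator
only ever sees one entry for M_DONE.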
2.5. The MSG_VERSION macro
This macro has not been used so far. It would allow a programmer to
raise the version number of a dictionary, preventing older
applications from using a newer, potentially incompatible dictionary.
The msgclean utility also plays with the version number of the
dictionary. The MSG_VERSION macro is still a concept rather than a
useful addition. Stay tuned...
2.6. The magic of the MSG_ macros
The MSG_ macros perform two tasks. First, they are easily spotted by
the msgscan utility. The parsing is simple and reliable even if the
C++ source code is not functional. Second, they hide the retrieval
mechanism (how the message value is retrieved from the binary
dictionary at runtime).
The msgscan utility produces the .m file, which looks like this for the
simple example above.
FILE prjfoo.m:
    extern const char **_dictionnary_prjfoo;
    #ifndef DICTIONNARY_REQUEST
        #define DICTIONNARY_REQUEST \
            const char **_dictionnary_prjfoo;\
            TRANSLATE_SYSTEM_REQ _dictionnary_req_prjfoo\
                ("prjfoo",_dictionnary_prjfoo,55,1);\
            void dummy_dict_prjfoo(){}
    #endif
    #ifndef MSG_U
        #define MSG_U(id,m) id
        #define MSG_B(id,m,n) id
        #define MSG_R(id) id
    #endif
    #define M_MSG1 _dictionnary_prjfoo[0]
As you can see, one global variable is created: _dictionnary_prjfoo. A
special macro DICTIONNARY_REQUEST is defined. This macro should be
placed in one source file of the project. It is generally placed in the
file _dict.c presented later.
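
To make the expansion concrete, here is a minimal sketch of what the
compiler sees once the macros are applied. It is an illustration under
assumptions, not the real runtime: demo_dict stands in for
_dictionnary_prjfoo, the two hand-filled tables stand in for binary
dictionaries loaded at run time, and the French wording and
use_french() are invented for the example.

```cpp
#include <string>

// Hypothetical stand-ins: in the real system the table is filled from the
// binary dictionary at run time, not written in the source.
static const char *demo_english[] = { "Entering function foo" };
static const char *demo_french[]  = { "Appel de la fonction foo" };
const char **demo_dict = demo_english;

#define MSG_U(id,m) id          // the value m is dropped by the preprocessor
#define M_MSG1 demo_dict[0]     // the ID becomes a slot in the table

std::string foo_message()
{
    // After preprocessing, this line reads: return demo_dict[0];
    return MSG_U(M_MSG1,"Entering function foo");
}

// Switching language only means pointing at another table.
void use_french()
{
    demo_dict = demo_french;
}
```

The string literal passed to MSG_U never reaches the object code, so the
text shown to the user depends only on which table the pointer refers to.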
3. How to use it
To produce a translatable program, do the following:
· Replace all string messages with MSG_U or MSG_B macros, giving each
  message a unique ID.
· Include (#include) the .m file in each source file using the MSG_x
  macros. This file is generally named directory.m, where directory is
  the name of the current directory.
· Create a file _dict.c. The content of this file is shown below.
· Use "make msg" to extract the messages. This produces or updates the
  dictionary file directory.dic and produces the include file
  directory.m.
· Compile and link your program.
· Use "make msg.eng" to produce the English binary dictionary. The
  file produced should be placed where your program expects it.
We will now describe the different steps in more detail.
3.1. The make msg command and msgscan utility
The make msg command invokes the msgscan utility. This utility scans a
set of C or C++ source files, updates a dictionary file and produces
one include file.
Here is the command used to update the dictionary of the sub-project
uucp of the Linuxconf project.
    ../translate/msgscan uucp \
        ../messages/sources/uucp.dic uucp.m EF *.c
The first argument is the name of the dictionary. The second argument
is the path of the dictionary file. As you can see, dictionary files
are kept in a single directory for all projects. They are seldom. This
eases the work of translators. The third argument is the path of the
include file, which is produced in the current directory.
The fourth argument gives the letter tags used to identify messages
defined with the macros MSG_U and MSG_B. Messages defined with MSG_U
will be tagged with the letter E (English) and messages defined with
MSG_B will be tagged with E for the first value and F (French) for the
second.
3.2. The _dict.c file
It is good practice to place the DICTIONNARY_REQUEST macro in a file
_dict.c. There is generally one such file per directory. Its
content is generally:
    #include "this_directory.m"
    #include <translat.h>
    DICTIONNARY_REQUEST
At least this dependency should be placed in your makefile:
    _dict.o: _dict.c this_directory.m
This will ensure that each time you update your dictionary (and the m
header file), _dict.c will be recompiled, ensuring proper recording of
the dictionary revision and number of messages. This will avoid
executing a program with an obsolete or incompatible binary
dictionary.
Given that _dict.c is small, the compilation is pretty short.
3.3. The msgcomp utility
Once you have compiled and linked your program, you must "compile"
all the dictionaries used in your program into one binary
dictionary. This is done by the msgcomp utility. Here is the command
used when doing "make msg.eng" for the Linuxconf project. This
produces the English binary dictionary.
    ../translate/msgcomp -p../messages/sources/ \
        /tmp/linuxconf-msg-1.3.eng eE \
        askrunlevel dialog dnsconf fstab \
        misc main netconf mailconf uucp userconf
This command takes the dictionaries of the sub-projects askrunlevel,
dialog, dnsconf, fstab, misc, main, netconf, mailconf, uucp and
userconf and produces a single binary dictionary.
The -p option tells msgcomp to look for those dic files
(askrunlevel.dic dialog.dic ...) in the directory
../messages/sources/.
The argument /tmp/linuxconf-msg-1.3.eng is the file to produce. The
argument eE instructs msgcomp to extract message values with the 'e'
tag. If there is no such value for a given message, the value with the
'E' tag will be used.
3.3.1. Convention used for tags
Dictionary files contain the definitions of all messages. Each
message may have several values, each identified by a tag letter. When
messages are extracted by msgscan, it is instructed to associate
values with given tags. By convention, we use upper case letters to
identify message values extracted from the source code. Lower case
letters are used by translators.
We assume here that programmers are bad writers. We let them give
their best shot at messages and we are allowed to override their
work without overwriting it. By giving precedence to 'e' tags over 'E'
we are saying that the translators' work overrides the work of
programmers, but we are not forcing the translators to rewrite everything.
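
The precedence rule behind an argument such as eE can be sketched in a
few lines. This is only an illustration of the rule as described here,
not the msgcomp source: the in-memory representation (TaggedValues) and
the function name pick_value are assumptions, and the on-disk .dic
format is not modelled.

```cpp
#include <map>
#include <string>

// Values of one message, keyed by tag letter (hypothetical in-memory form).
using TaggedValues = std::map<char, std::string>;

// Return the value for the first tag in `tags` that has one,
// or an empty string if the message has no value for any of them.
std::string pick_value(const TaggedValues &values, const std::string &tags)
{
    for (char tag : tags) {
        auto it = values.find(tag);
        if (it != values.end()) return it->second;
    }
    return std::string();
}
```

With the tag string "eE", a message that a translator has reworded
under 'e' yields the reworded text, while an untouched message falls
back to the programmer's 'E' value.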
3.4. The msgclean utility
The msgscan utility maintains the dictionary. At some point some
messages may become obsolete (unused in any source file). The msgclean
utility is used to remove messages without values from the dic file.
For the Linuxconf project, the make target msg.clean is defined for
that purpose.
Be aware that applying msgclean to a dictionary file with obsolete
messages has an important side effect. Once some messages are deleted,
the numbering of all following messages changes. All sources using
the m include file should be recompiled.
To avoid problems, the msgclean utility automatically increases the
revision number of the dictionary. This prevents using a dictionary
with an incompatible program.
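
The renumbering side effect is easy to see on a toy model. The sketch
below is an assumption-laden illustration, not the msgclean code: the
structures DemoMessage and DemoDictionary, the has_value flag and the
choice to raise the revision only when something was removed are all
hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct DemoMessage {
    std::string id;
    bool has_value;     // false once no source file defines the message
};

struct DemoDictionary {
    int revision;
    std::vector<DemoMessage> messages;   // position == message number
};

// Drop messages without a value; raise the revision if anything was removed.
void demo_clean(DemoDictionary &dict)
{
    std::vector<DemoMessage> kept;
    for (const DemoMessage &m : dict.messages)
        if (m.has_value) kept.push_back(m);
    if (kept.size() != dict.messages.size()) {
        dict.messages = kept;
        ++dict.revision;
    }
}

// Message number of an ID, or -1 if the ID is not in the dictionary.
int demo_number_of(const DemoDictionary &dict, const std::string &id)
{
    for (std::size_t i = 0; i < dict.messages.size(); ++i)
        if (dict.messages[i].id == id) return static_cast<int>(i);
    return -1;
}
```

If an obsolete message sits in slot 1, the message that used to be
number 2 becomes number 1 after cleaning. An object file compiled
against the old .m file would then index the wrong slot, which is why
the revision has to change.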
4. Usage restrictions
The strategy used is mainly targeted at C++ code. With some
restrictions, it may be used for C code. Here are the main features that
probably don't work with C.
static initialisation
    In C++ one can write the following code.
        static char *tb[]={
            foo(1),foo(22)
        };
    where foo is a function. The C++ compiler will generate the proper
    initialisation code, which will probably be called once. The MSG_U
    macro (and the others) do not hide a function call, but they are
    dynamic in some sense. C does not support this. Other translation
    strategies based on dictionaries have the same limitation though.
The example using the static char *tb[] also causes a problem in
C++ if the variable is declared outside of a function. The problem
appears because the "hidden" initialisation code generated by the
compiler is called very early, often before main() is called.
Normally, the function translat_load(), which brings the dictionary
into memory, is called by main().
Fortunately, the current implementation, where _dictionnary_system is a
pointer, will trigger a segmentation fault whenever this condition is
met. This fault will be triggered every time, because all such
initialisations are called before main(). The strategy is safe.
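
The ordering problem, and one common way around it, can be sketched as
follows. This is not from the Linuxconf sources: demo_dict, demo_load()
and the M_YES/M_NO IDs are hypothetical, demo_load() merely mimics what
translat_load() is described as doing, and the function-local static is
a standard C++ idiom offered here as an assumption about how one might
cope, not as the author's recommendation.

```cpp
#include <string>

// Hypothetical stand-in for a project dictionary pointer. It is null until
// demo_load() runs, which mimics translat_load() being called from main().
const char **demo_dict = nullptr;
#define M_YES demo_dict[0]
#define M_NO  demo_dict[1]

// A file-scope initializer runs before main(), so it still sees the null
// pointer. Writing { M_YES, M_NO } at file scope would dereference it there.
bool demo_null_at_startup = (demo_dict == nullptr);

void demo_load()
{
    static const char *values[] = { "Yes", "No" };
    demo_dict = values;
}

// A function-local static is initialized on the first call instead,
// which can be arranged to happen after the dictionary is loaded.
std::string demo_answer(int i)
{
    static const char *tb[] = { M_YES, M_NO };
    return tb[i];
}
```

The table inside demo_answer() is built only when the function is first
called, so as long as that call happens after the dictionary has been
loaded the lookup is valid.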
5. Recommended usage and conventions
5.1. Naming convention for message IDs
To help people who will translate Linuxconf, I have used a
convention for the ID names.
B_  Buttons.
E_  Error messages start with this.
F_  Field labels start with this.
I_  Dialog introductions start with this.
M_  All menu entries start with this prefix.
N_  Notices and warnings start with this.
P_  When the user is prompted for a password, the message ID starts
    with this.
Q_  Identifies a question (generally a Yes/No prompt).
T_  Dialog titles start with this.
X_  All other messages which fit in no category.
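
A tool that lists messages for translators could use the prefix to
group them. The helper below is a hypothetical sketch of that idea; no
such function is described in this document, and the category names it
returns are simply paraphrased from the table above.

```cpp
#include <string>

// Map a message ID to the category suggested by its one-letter prefix.
// Returns "unknown" for IDs that do not follow the convention.
std::string demo_category(const std::string &id)
{
    if (id.size() < 2 || id[1] != '_') return "unknown";
    switch (id[0]) {
    case 'B': return "button";
    case 'E': return "error message";
    case 'F': return "field label";
    case 'I': return "dialog introduction";
    case 'M': return "menu entry";
    case 'N': return "notice or warning";
    case 'P': return "password prompt";
    case 'Q': return "question";
    case 'T': return "dialog title";
    case 'X': return "other";
    default:  return "unknown";
    }
}
```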
6. How to translate
6.1. Go simple
One way to translate is to go right into the .dic files and add
translations for each message using a different tag. Then use the
msgcomp utility to extract the proper definition.
At first, there is little problem doing this. The msgscan utility
reads, updates and saves the .dic file, so your changes won't be lost.
The problem comes from the way software is developed. First we develop
and then, when it is stable, we translate. Doing so means that we have
to walk through all the .dic files to make sure our translations still
fit the original messages (the English version for example). Those
original messages may have changed.
A different scheme was chosen for Linuxconf.
6.2. Organisation of the messages directory
The messages directory contains one subdirectory per language plus one
sources directory. The sources directory contains all the .dic files
produced by msgscan. These files are never edited by hand.
Each other directory has a copy of those .dic files with the proper
translation. A special utility, msgupd, has been created: it basically
compares all messages in the sources directory with messages in the
translated directory. It compares only one language (say the English
version).
Mostly, msgupd will tell you
· Which messages are new.
· Which messages have changed (the English wording).
Using that information, you know exactly what you have to do to keep
your work in sync with the current release of Linuxconf. msgupd will
reorder the translated .dic file (not the one in the sources
directory) so all messages which need work are at the beginning of
the file. It also adds a comment (.dic files may have comments like
most normal Unix configuration files) explaining what has to be done.
If the English version of the message was changed, it will retag the
version in the translated file and add the new version, plus a
comment. The old English message will have the tag "Z". You can
easily see what the change is.
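
The comparison can be sketched as follows. This is a rough model of the
behaviour described above, not the msgupd source: the types DemoDict
and DemoReport and the function demo_update are hypothetical, the .dic
file format is not modelled, and the reordering of the file and the
comments msgupd writes are left out.

```cpp
#include <map>
#include <string>
#include <vector>

using DemoValues = std::map<char, std::string>;       // tag letter -> value
using DemoDict   = std::map<std::string, DemoValues>; // message ID -> values

struct DemoReport {
    std::vector<std::string> added;     // IDs missing from the translation
    std::vector<std::string> changed;   // IDs whose reference wording differs
};

// Compare the reference tag (for example 'E') of every source message with
// the translated copy, update the copy, and report what needs attention.
DemoReport demo_update(const DemoDict &sources, DemoDict &translated, char ref_tag)
{
    DemoReport report;
    for (const auto &entry : sources) {
        const std::string &id = entry.first;
        auto src = entry.second.find(ref_tag);
        if (src == entry.second.end()) continue;       // nothing to compare
        auto tr = translated.find(id);
        if (tr == translated.end()) {
            translated[id][ref_tag] = src->second;     // new message
            report.added.push_back(id);
        } else if (tr->second[ref_tag] != src->second) {
            tr->second['Z'] = tr->second[ref_tag];     // keep old wording as Z
            tr->second[ref_tag] = src->second;         // record the new wording
            report.changed.push_back(id);
        }
    }
    return report;
}
```

Existing translations (lower case tags) are left untouched, so the
translator can compare the Z value with the new reference wording and
decide whether the translation still fits.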
6.3. The msgupd utility
The file rules.mak shows the rules for one translation (which is not
done yet). Look for the targets msg.cfr and upd.cfr. To add a new
language, do this:
· Create a new empty directory in the messages directory, for
  example, mar for an alien language.
· Customise rules.mak and add the targets msg.mar and upd.mar.
· Run the following command. This will fill the messages/mar
  directory with all the necessary .dic files.
      make upd.mar
· Go into messages/mar, edit each .dic file and add the proper
  translation as needed.
· Run the following command to produce the binary dictionary
  required to run Linuxconf.
      make msg.mar
· Set the following environment variables and run Linuxconf.
  · export LINUXCONF_LANG=mar
  · export LINUXCONF_DICT=/tmp
    This variable is optional. Linuxconf will normally look for its
    message dictionary in /usr/lib/linuxconf. This variable overrides
    that. The msg.* makefile targets generally produce their output
    in /tmp. This is useful to test new messages without breaking the
    current installation of Linuxconf.
  Be aware that this mechanism only works if you execute Linuxconf as
  root. For security reasons, a normal user can't override the message
  dictionary of Linuxconf (although he can select a different
  language from /usr/lib/linuxconf if available).
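
The lookup rule described above can be summarised in a short sketch. It
is an assumption about how such a rule might be coded, not the
Linuxconf implementation: both function names are hypothetical, and the
file name pattern linuxconf-msg-VERSION.LANG is inferred only from the
msgcomp example (/tmp/linuxconf-msg-1.3.eng).

```cpp
#include <string>

// Directory in which to look for the binary dictionary. `override_dir` is
// the value of LINUXCONF_DICT ("" if unset); it is honoured only for root.
std::string demo_dict_dir(const std::string &override_dir, bool is_root)
{
    if (is_root && !override_dir.empty()) return override_dir;
    return "/usr/lib/linuxconf";
}

// Full path of the dictionary file, assuming the naming pattern seen in the
// msgcomp example: DIR/linuxconf-msg-VERSION.LANG
std::string demo_dict_path(const std::string &dir,
                           const std::string &version,
                           const std::string &lang)
{
    return dir + "/linuxconf-msg-" + version + "." + lang;
}
```

A caller would read LINUXCONF_DICT and LINUXCONF_LANG from the
environment and pass them in. Keeping the root check in one place makes
it harder to honour the override by accident for a normal user.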
6.4. The msgcomp utility
The msgcomp utility has been tweaked to support the distributed
directory concept. Mainly, it uses the .dic file in the sources
directory as a reference. Message ID numbers are defined from this
file. It then uses (optionally) alternative
7. Licensing
The translate directory is part of the Linuxconf project but carries a
special license. There is no restriction on usage. Feel free to
incorporate this system into any project.
This simple license does not apply to the rest of Linuxconf, which is
covered by the standard GNU Copyleft license. See the file LICENSE in
the root directory.
If you find it useful for other projects, send me a note and some
comments if possible.